Search Results: "Michael Stapelberg"

26 June 2017

Niels Thykier: debhelper 10.5.1 now available in unstable

Earlier today, I uploaded debhelper version 10.5.1 to unstable. The following are some highlights compared to version 10.2.5: There are also some changes to the upcoming compat 11
Filed under: Debhelper, Debian

9 April 2017

Michael Stapelberg: manpages.debian.org: what's new since the launch?

On 2017-01-18, I announced that https://manpages.debian.org had been modernized. Let me catch you up on a few things which happened in the meantime: The list above is not complete, but rather a selection of things I found worth pointing out to the larger public. There are still a few things I plan to work on soon, so stay tuned :).

22 March 2017

Michael Stapelberg: Debian stretch on the Raspberry Pi 3 (update)

I previously wrote about my Debian stretch preview image for the Raspberry Pi 3. Now, I'm publishing an updated version, containing the following changes: A couple of issues remain, notably the lack of HDMI, WiFi and Bluetooth support (see wiki:RaspberryPi3 for details). Any help with fixing these issues is very welcome! As a preview version (i.e. unofficial, unsupported, etc.), until all the necessary bits and pieces are in place to build images in a proper place in Debian, I built and uploaded the resulting image. Find it at https://people.debian.org/~stapelberg/raspberrypi3/2017-03-22/. To install the image, insert the SD card into your computer (I'm assuming it's available as /dev/sdb) and copy the image onto it:
$ wget https://people.debian.org/~stapelberg/raspberrypi3/2017-03-22/2017-03-22-raspberry-pi-3-stretch-PREVIEW.img
$ sudo dd if=2017-03-22-raspberry-pi-3-stretch-PREVIEW.img of=/dev/sdb bs=5M
If resolving client-supplied DHCP hostnames works in your network, you should be able to log into the Raspberry Pi 3 using SSH after booting it:
$ ssh root@rpi3
# Password is "raspberry"

1 February 2017

Antoine Beaupré: My free software activities, January 2017

manpages.debian.org launched The debmans package I had so lovingly worked on last month is now officially abandoned. It turns out that another developer, Michael Stapelberg, wrote his own implementation from scratch, called debiman. Both programs share a similar design: they are static site generators that parse an existing archive and call another tool to convert manpages into HTML. We even both settled on the same converter (mdoc). But while I wrote debmans in Python, debiman is written in Go. debiman also seems much faster, having been written with concurrency in mind from the start. Finally, debiman is more feature-complete: it properly deals with conflicting packages, localization and all sorts of redirections. Heck, it even has a pretty logo; how can I compete? While debmans was written first and was in the process of being deployed, I had to give it up. It was a frustrating experience, because I felt I had wasted a lot of time working on software that ended up being discarded, especially because I had put so much work into it, creating extensive documentation, an almost complete test suite and even filing a detailed core infrastructure best practices report. In the end, I think it was the right choice: debiman seemed clearly superior, and the best tool should win. Plus, it meant less work for me: Michael and Javier (the previous manpages.debian.org maintainer) did all the work of putting the site online. I also learned a lot about the CII best practices program, flask, click and, ultimately, the Go programming language itself, which I'll refer to as Golang for brevity. debiman definitely brought Golang into the spotlight for me. I had looked at Go before, but it seemed to be yet another language. Seeing Michael beat me to rebuilding the service really made me look at it again more seriously. While I really appreciate Python and will probably keep it as my language of choice for GUI work and smaller scripts, for daemons, network programs and servers I will seriously consider Golang in the future. The site is now online at https://manpages.debian.org/. I even got credited on the about page, which makes up for the disappointment.
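The "call another tool to convert manpages into HTML" step that both generators share is easy to sketch in Go. This is not code from either project; the mandoc invocation and its -Thtml flag are my assumptions for illustration.

package main

import (
    "fmt"
    "os"
    "os/exec"
)

// renderManpage is a hypothetical stand-in for the convert step that
// debmans and debiman both perform: hand one manpage to an external
// converter and collect the HTML it produces.
func renderManpage(path string) ([]byte, error) {
    // mandoc -Thtml reads the manpage source and emits HTML on stdout.
    cmd := exec.Command("mandoc", "-Thtml", path)
    cmd.Stderr = os.Stderr
    return cmd.Output()
}

func main() {
    html, err := renderManpage("./example.1")
    if err != nil {
        panic(err)
    }
    fmt.Printf("rendered %d bytes of HTML\n", len(html))
}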

Wallabako downloads Wallabag articles on my Kobo e-reader This obviously brings me to the latest project I worked on, Wallabako, my first Golang program ever. Wallabako is basically a client for the Wallabag application, which is a free software "read it later" service, an alternative to the likes of Pocket, Pinboard or Evernote. Back in April, I had looked at downloading my "unread articles" onto my new ebook reader, going through convoluted ways like implementing OPDS support in Wallabag, which turned out to be too difficult. Instead, I used this as an opportunity to learn Golang. After reading the quite readable golang specification over the weekend, I found the language to be quite elegant and simple, yet very powerful. Golang feels like C, but built with concurrency and memory (and, to a certain extent, type) safety in mind, along with a novel approach to OO programming. The fact that everything can be compiled into one neat little static binary was also a key feature in selecting golang for this project, as I do not have much control over the platform my e-reader is running: it is a Linux machine running on the ARM architecture, but beyond that, there isn't much available. I couldn't afford to ship a Python interpreter in there, and while there are solutions like pyinstaller, I felt that it may not be so easy to deploy on ARM. The borg team had trouble building an ARM binary, resorting to tricks like building on a Raspberry Pi or inside an emulator. In comparison, the native Go compiler supports cross-compilation out of the box through a simple environment variable. So far Wallabako works amazingly well: when I "bag" a new article in Wallabag, either from my phone or my web browser, it will show up on my ebook reader the next time I turn on the WiFi. I still need to "tap" the screen to fake the insertion of the USB cable, but we're working on automating that. I also need to make the installation of the software much easier and improve the documentation, because so far it's unlikely that someone unfamiliar with Kobo hardware hacking will be able to install it.

Other work According to GitHub, I filed a bunch of bugs all over the place (25 issues in 16 repositories), sent patches everywhere (13 pull requests in 6 repositories), and tried to fix everything (created 38 commits in 7 repositories). Note that this excludes most of my work, which happens on GitLab. January was still a very busy month, especially considering I had an accident which kept me mostly offline for about a week. Here are some details on specific projects.

Stressant and a new computer I revived the stressant project and got a new computer. This will be covered in a separate article.

Linkchecker forked After much discussion, it was decided to fork the linkchecker project, which now lives in its own organization. I still have to write community guidelines and figure out the best way to maintain a stable branch, but I am hopeful that the community will pick up the project, as multiple people have volunteered to co-maintain it. There have already been pull requests and issues reported, so that's a good sign.

Feed2tweet refresh I re-rolled my pull requests to the feed2tweet project: last time, they were closed before I had time to rebase them. The author was okay with me re-submitting them, but he hasn't commented on, reviewed or merged the patches yet, so I am worried they will be dropped again. At that point, I would more likely rewrite this from scratch than try to collaborate with someone who is clearly not interested in doing so...

Debian uploads

Debian Long Term Support (LTS) This is my 10th month working on Debian LTS, a project started by Raphael Hertzog at Freexian. I took two months off last summer, which means it's actually been a year of work on the LTS project. This month I worked on a few issues, but they were big issues, so they took a lot of time. I have done a lot of work trying to backport the header sanitization patches for CVE-2016-8743. The full report explains all the gritty details, but I ran out of time and couldn't upload the final version either. The issue mostly affects Apache servers in proxy configurations, so it's not so severe as to warrant an immediate upload anyway. A lot of my time was spent battling the tiff package. The report mentions fixes for 15 CVEs, and I uploaded the result in the DLA-795-1 advisory. I also worked on a small update to graphicsmagick for CVE-2016-9830 that is still pending, because the issue is minor and we're waiting for more to pile up. See the full report for details. Finally, there was a small discussion surrounding tools to use when building and testing updates to LTS packages. The resulting conversation was interesting, but it showed that we have a big documentation problem in the Debian project. There are a lot of tools, and the documentation is old and distributed everywhere. Every time I want to contribute something to the documentation, I never know where to start. This is why I wrote a separate Debian development guide instead of contributing to existing documentation...

18 January 2017

Michael Stapelberg: manpages.debian.org has been modernized

https://manpages.debian.org has been modernized! We have just launched a major update to our manpage repository. What used to be served via a CGI script is now a statically generated website, and therefore blazingly fast. While we were at it, we have restructured the paths so that we can serve all manpages, even those whose name conflicts with other binary packages (e.g. crontab(5) from cron, bcron or systemd-cron). Don't worry: the old URLs are redirected correctly. Furthermore, the design of the site has been updated and now includes navigation panels that allow quick access to the manpage in other Debian versions, other binary packages, other sections and other languages. Speaking of languages, the site serves manpages in all their available languages and respects your browser's language when redirecting or following a cross-reference. Much like the Debian package tracker, manpages.debian.org includes packages from Debian oldstable, oldstable-backports, stable, stable-backports, testing and unstable. New manpages should make their way onto manpages.debian.org within a few hours. The generator program (debiman) is open source and can be found at https://github.com/Debian/debiman. In case you would like to use it to run a similar manpage repository (or convert your existing manpage repository to it), we'd love to help you out; just send an email to stapelberg AT debian DOT org. This effort is standing on the shoulders of giants: check out https://manpages.debian.org/about.html for a list of people we thank. We'd love to hear your feedback and thoughts. Either contact us via an issue on https://github.com/Debian/debiman/issues/, or send an email to the debian-doc mailing list (see https://lists.debian.org/debian-doc/).
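The language handling described above boils down to content negotiation on the Accept-Language header. Here is a minimal, stdlib-only sketch of that idea; it is not debiman's actual code, and the path layout and the deliberately crude header parsing are assumptions for illustration only.

package main

import (
    "fmt"
    "net/http"
    "strings"
)

// supported lists the languages a hypothetical manpage is available in.
var supported = map[string]bool{"en": true, "de": true, "fr": true}

// pickLanguage crudely scans Accept-Language in order and returns the
// first supported language, falling back to English.
func pickLanguage(header string) string {
    for _, part := range strings.Split(header, ",") {
        lang := strings.TrimSpace(strings.SplitN(part, ";", 2)[0])
        if supported[lang] {
            return lang
        }
    }
    return "en"
}

func main() {
    http.HandleFunc("/crontab.5", func(w http.ResponseWriter, r *http.Request) {
        lang := pickLanguage(r.Header.Get("Accept-Language"))
        // Redirect to the language-qualified page (this path layout is made up).
        http.Redirect(w, r, fmt.Sprintf("/crontab.5.%s.html", lang), http.StatusFound)
    })
    http.ListenAndServe(":8080", nil)
}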

25 November 2016

Michael Stapelberg: Debian package build tools

Personally, I find the packaging tools which are available in Debian far too complex. To better understand the options we have, I created a diagram of tools which are frequently used, covering only the build step (i.e. no post-build quality assurance checks or packaging-time helpers): (diagram: Debian package build tools) When I was first introduced to Debian packaging, people recommended I use pbuilder. Given how complex the toolchain is in the pbuilder case, I don't understand why that is (was?) a common recommendation. Back in August 2015, so well over a year ago, I switched to sbuild, motivated by how much simpler it was to implement ratt (which rebuilds reverse build dependencies) using sbuild, and I have not looked back. Are there people who do not use sbuild for reasons other than familiarity? If so, please let me know, I'd like to understand. I also made a version of the diagram above, colored by the programming languages in which the tools are implemented. The chosen colors are heavily biased :-). (diagram: Debian package build tools, by language) To me, the diagram means: if you want to make substantial changes to the Debian build tool infrastructure, you need to become an expert in all of Python, Perl, Bash, C and Make. I know that this is not true for every change, but it still irks me that there might be changes for which it is required. I propose to eliminate complexity in Debian by deprecating the pbuilder toolchain in favor of sbuild.

24 November 2016

Michael Stapelberg: Debian stretch on the Raspberry Pi 3

The last couple of days, I worked on getting Debian to run on the Raspberry Pi 3. Thanks to the work of many talented people, the Linux kernel in version 4.8 is _almost_ ready to run on the Raspberry Pi 3. The only missing thing is the bcm2835 MMC driver, which is required to read the root file system from the SD card. I've asked our maintainers to include the patch for the time being. Aside from the kernel, one also needs a working bootloader, hence I used Ubuntu's linux-firmware-raspi2 package and uploaded the linux-firmware-raspi3 package to Debian. The package is currently in the NEW queue and needs to be accepted by ftp-master before entering Debian. The most popular method of providing a Linux distribution for the Raspberry Pi is to provide an image that can be written to an SD card. I made two little changes to vmdebootstrap (#845439, #845526) which make it easier to create such an image. The Debian wiki page https://wiki.debian.org/RaspberryPi3 describes the current state of affairs and should be updated, as this blog post will not be. As a preview version (i.e. unofficial, unsupported, etc.), until all the necessary bits and pieces are in place to build images in a proper place in Debian, I built and uploaded the resulting image. Find it at https://people.debian.org/~stapelberg/raspberrypi3/. To install the image, insert the SD card into your computer (I'm assuming it's available as /dev/sdb) and copy the image onto it:
$ wget https://people.debian.org/~stapelberg/raspberrypi3/2016-11-24-raspberry-pi-3-stretch-PREVIEW.img
$ sudo dd if=2016-11-24-raspberry-pi-3-stretch-PREVIEW.img of=/dev/sdb bs=5M
I hope this initial work on getting Debian booted will motivate other people to contribute little improvements here and there. A list of current limitations and potential improvements can be found on the RaspberryPi3 Debian wiki page.

6 October 2016

Reproducible builds folks: Reproducible Builds: week 75 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday September 25 and Saturday October 1 2016: Statistics For the first time, we reached 91% reproducible packages in our testing framework on testing/amd64 using a deterministic build path. (This is what we recommend to make packages in Stretch reproducible.) For unstable/amd64, where we additionally test for reproducibility across different build paths, we are at almost 76% again. IRC meetings We have a poll to set a time for a new regular IRC meeting. If you would like to attend, please input your available times and we will try to accommodate you. There was a trial IRC meeting on Friday, 2016-09-30 1800 UTC. Unfortunately, we did not activate meetbot. Despite this, participants consider the meeting a success, as several topics were discussed (e.g. changes to IRC notifications of tests.r-b.o) and the meeting stayed within one hour in length. Upcoming events Reproduce and Verify Filesystems - Vincent Batts, Red Hat - Berlin (Germany), 5th October, 14:30 - 15:20 @ LinuxCon + ContainerCon Europe 2016. From Reproducible Debian builds to Reproducible OpenWrt, LEDE & coreboot - Holger "h01ger" Levsen and Alexander "lynxis" Couzens - Berlin (Germany), 13th October, 11:00 - 11:25 @ OpenWrt Summit 2016. Introduction to Reproducible Builds - Vagrant Cascadian will be presenting at the SeaGL.org conference in Seattle (USA), November 11th-12th, 2016. Previous events GHC Determinism - Bartosz Nitka, Facebook - Nara (Japan), 24th September, ICFP 2016. Toolchain development and fixes Michael Meskes uploaded bsdmainutils/9.0.11 to unstable with a fix for #830259 based on Reiner Herrmann's patch. This fixed the locale_dependent_symbol_order_by_lorder issue in the affected packages (freebsd-libs, mmh). devscripts/2.16.8 was uploaded to unstable. It includes a debrepro script by Antonio Terceiro which is similar in purpose to reprotest but more lightweight; it is specific to Debian packages and has no support for virtual servers or configurable variations. Packages reviewed and fixed, and bugs filed The following updated packages have become reproducible in our testing framework after being fixed: The following updated packages appear to be reproducible now for reasons we were not able to figure out. (Relevant changelogs did not mention reproducible builds.) Some uploads have addressed some reproducibility issues, but not all of them: Patches submitted that have not made their way to the archive yet: Reviews of unreproducible packages 77 package reviews have been added, 178 have been updated and 80 have been removed this week, adding to our knowledge about identified issues. 6 issue types have been updated: Weekly QA work As part of reproducibility testing, FTBFS bugs have been detected and reported by: diffoscope development A new version of diffoscope, 61, was uploaded to unstable by Chris Lamb. It included contributions from: Post-release there were further contributions from: reprotest development A new version of reprotest, 0.3.2, was uploaded to unstable by Ximin Luo. It included contributions from: Post-release there were further contributions from: tests.reproducible-builds.org Misc. This week's edition was written by Ximin Luo, Holger Levsen & Chris Lamb and reviewed by a bunch of Reproducible Builds folks on IRC.

8 August 2016

Michael Stapelberg: Debian Code Search: improving client-side latency

A while ago, it occurred to me that querying Debian Code Search seemed slow, which surprised me because I had previously spent quite some effort on making it faster; see Debian Code Search Instant and Taming the latency tail for the most recent substantial architecture overhaul and related optimizations. Upon taking a closer look, I realized that while performing the search query on the server side was pretty fast, the perceived slowness was due to the client side being slow. By "being slow", I mean that it took a long time until something was drawn on the screen (high latency) and that what was happening on the screen was janky (stuttering, not smooth). Part of that slowness was due to historical reasons: the client-side architecture was optimized for the use-case where users open Debian Code Search's index page and then submit a search query, but I was using Chrome's address bar to send a search query (type "codesearch", then hit the TAB key). Further, we only added a non-JavaScript version after we launched the JavaScript version. Hence, the redirects and progressive enhancements we implemented are more of a kludge than a well thought out design. After this bit of initial investigation, I opened GitHub issue #69 to track the work on making Debian Code Search faster. In that issue, I captured how Chrome's network inspector visualizes the work necessary to render the page: (screenshot: Chrome network inspector, before) A couple of quick wins There are a couple of little fixes and improvements on which I'm not going to spend too much time, but which I list for completeness anyway, just in case they come in handy for a similar project of yours: Bigger changes The URL pattern has changed. Previously, we had 2 areas of the website, one for JavaScript-compatible clients and one for the rest. When you hit the wrong one, you were redirected. In some areas, we couldn't tell which area is the correct one for you, so you would always incur a redirect: one example of this is the search bar. With the new URL pattern, we deliver both versions under the same URL: the elements only used by the JavaScript code are hidden using CSS by default, then made visible by JavaScript code. The elements only used by the non-JavaScript code are wrapped in a <noscript> tag. All CSS which is required for the initial page rendering is now inlined in the responses, allowing the browser to immediately render a response without requiring any additional round trips. All non-essential CSS has been moved into a separate CSS file which is loaded asynchronously. This is done using a pattern like <link rel="preload" href="foo.css" as="style" onload="this.rel='stylesheet'">, see also filamentgroup/loadCSS. We switched from WebSockets to the EventSource API, because the former is not compatible with HTTP/2, whereas the latter is. This removes a round trip and some custom code for WebSocket reconnecting, because EventSource does that for you. The progress bar animation used to animate the background-position property. It turns out that browsers can only animate the position, scale, rotation and opacity properties smoothly, because such animations can be off-loaded to the GPU. Hence, we have re-implemented the progress bar animation using the position property. The biggest win for improving client-side latency from the Chrome address bar was introducing Service Workers (see commit 7f31aef402cb782056e290a797f224171f4af270). Our Service Worker caches static assets and a placeholder results page. The placeholder page is presented immediately when you start a search (e.g. from the address bar), making the first response immediate, i.e. rendered within 100ms. Having assets and the result page out of the way, the first round trip is used for actually doing the search, removing all unnecessary overhead. With all of these improvements in place, rendering latency goes down from half a second to well under 100 ms, and this is what the Chrome network inspector looks like: (screenshot: Chrome network inspector, after)
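For the curious, the server side of an EventSource endpoint needs nothing beyond the Go standard library. The sketch below shows the general shape of a Server-Sent-Events handler, not DCS's actual code; the endpoint name and payloads are made up.

package main

import (
    "fmt"
    "net/http"
    "time"
)

// events streams Server-Sent Events; an EventSource client reconnects
// automatically, so no custom reconnect logic is needed on either side.
func events(w http.ResponseWriter, r *http.Request) {
    flusher, ok := w.(http.Flusher)
    if !ok {
        http.Error(w, "streaming unsupported", http.StatusInternalServerError)
        return
    }
    w.Header().Set("Content-Type", "text/event-stream")
    w.Header().Set("Cache-Control", "no-cache")
    for i := 0; i <= 5; i++ {
        // Each message is a "data:" line terminated by a blank line.
        fmt.Fprintf(w, "data: progress %d%%\n\n", i*20)
        flusher.Flush() // push the event out immediately
        time.Sleep(time.Second)
    }
}

func main() {
    http.HandleFunc("/events", events)
    http.ListenAndServe(":8080", nil)
}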

17 July 2016

Michael Stapelberg: mergebot: easily merging contributions

Recently, I was wondering why I was pushing off accepting contributions in Debian for longer than in other projects. It occurred to me that the effort to accept a contribution in Debian is way higher than in other FOSS projects. My remaining FOSS projects are on GitHub, where I can just click the Merge button after deciding a contribution looks good. In Debian, merging is actually a lot of work: I need to clone the repository, configure it, merge the patch, update the changelog, build and upload. I wondered how close we can bring Debian to a model where accepting a contribution is just a single click as well. In principle, I think it can be done. To demonstrate the feasibility and collect some feedback, I wrote a program called mergebot. The first stage is done: mergebot can be used on your local machine as a command-line tool. You provide it with the source package and bug number which contains the patch in question, and it will do the rest:
midna ~ $ mergebot -source_package=wit -bug=#831331
2016/07/17 12:06:06 will work on package "wit", bug "831331"
2016/07/17 12:06:07 Skipping MIME part with invalid Content-Disposition header (mime: no media type)
2016/07/17 12:06:07 gbp clone --pristine-tar git+ssh://git.debian.org/git/collab-maint/wit.git /tmp/mergebot-743062986/repo
2016/07/17 12:06:09 git config push.default matching
2016/07/17 12:06:09 git config --add remote.origin.push +refs/heads/*:refs/heads/*
2016/07/17 12:06:09 git config --add remote.origin.push +refs/tags/*:refs/tags/*
2016/07/17 12:06:09 git config user.email stapelberg AT debian DOT org
2016/07/17 12:06:09 patch -p1 -i ../latest.patch
2016/07/17 12:06:09 git add .
2016/07/17 12:06:09 git commit -a --author Chris Lamb <lamby AT debian DOT org> --message Fix for "wit: please make the build reproducible" (Closes: #831331)
2016/07/17 12:06:09 gbp dch --release --git-author --commit
2016/07/17 12:06:09 gbp buildpackage --git-tag --git-export-dir=../export --git-builder=sbuild -v -As --dist=unstable
2016/07/17 12:07:16 Merge and build successful!
2016/07/17 12:07:16 Please introspect the resulting Debian package and git repository, then push and upload:
2016/07/17 12:07:16 cd "/tmp/mergebot-743062986"
2016/07/17 12:07:16 (cd repo && git push)
2016/07/17 12:07:16 (cd export && debsign *.changes && dput *.changes)
midna ~ $ cd /tmp/mergebot-743062986/repo
midna /tmp/mergebot-743062986/repo $ git log HEAD~2..
commit d983d242ee546b2249a866afe664bac002a06859
Author: Michael Stapelberg <stapelberg AT debian DOT org>
Date:   Sun Jul 17 13:32:41 2016 +0200
    Update changelog for 2.31a-3 release
commit 5a327f5d66e924afc656ad71d3bfb242a9bd6ddc
Author: Chris Lamb <lamby AT debian DOT org>
Date:   Sun Jul 17 13:32:41 2016 +0200
    Fix for "wit: please make the build reproducible" (Closes: #831331)
midna /tmp/mergebot-743062986/repo $ git push
Counting objects: 11, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (11/11), done.
Writing objects: 100% (11/11), 1.59 KiB   0 bytes/s, done.
Total 11 (delta 6), reused 0 (delta 0)
remote: Sending notification emails to: dispatch+wit_vcs@tracker.debian.org
remote: Sending notification emails to: dispatch+wit_vcs@tracker.debian.org
To git+ssh://git.debian.org/git/collab-maint/wit.git
   650ee05..d983d24  master -> master
 * [new tag]         debian/2.31a-3 -> debian/2.31a-3
midna /tmp/mergebot-743062986/repo $ cd ../export
midna /tmp/mergebot-743062986/export $ debsign *.changes && dput *.changes
[ ]
Uploading wit_2.31a-3.dsc
Uploading wit_2.31a-3.debian.tar.xz
Uploading wit_2.31a-3_amd64.deb
Uploading wit_2.31a-3_amd64.changes
Of course, this is not quite as convenient as clicking a Merge button yet. I have some ideas on how to make that happen, but I need to know whether people are interested before I spend more time on this. Please see github.com/Debian/mergebot for more details, and please get in touch if you think this is worthwhile or would even like to help. Feedback is accepted in the GitHub issue tracker for mergebot or the project mailing list mergebot-discuss. Thanks!

19 June 2016

Paul Tagliamonte: Go Debian!

As some of the world knows full well by now, I've been noodling with Go for a few years, working through its pros, its cons, and thinking a lot about how humans use code to express thoughts and ideas. Go's got a lot of neat use cases, suited to particular problems, and used in the right place, you can see some clear massive wins. I've started writing Debian tooling in Go, because it's a pretty natural fit. Go's fairly tight, and overhead shouldn't be taken up by your operating system. After a while, I wound up hitting the usual blockers, and started to build up abstractions. They became pretty darn useful, so, this blog post is announcing (a still incomplete, year old and perhaps API-changing) Debian package for Go. The Go importable name is pault.ag/go/debian. This contains a lot of utilities for dealing with Debian packages, and will become an edited down "toolbelt" for working with or on Debian packages. Module Overview Currently, the package contains five major sub-packages: a changelog parser, a control file parser, a deb file format parser, a dependency parser and a version parser. Together, these are a set of powerful building blocks which can be used together to create higher order systems with reliable understandings of the world. changelog The first (and perhaps most incomplete and least tested) is the changelog file parser. This provides the programmer with the ability to pull out the suite being targeted in the changelog, when each upload happened, and the version of each. For example, let's look at how we can find out when all the uploads of Docker to sid took place:
package main

import (
    "fmt"
    "net/http"

    "pault.ag/go/debian/changelog"
)

func main() {
    resp, err := http.Get("http://metadata.ftp-master.debian.org/changelogs/main/d/docker.io/unstable_changelog")
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    allEntries, err := changelog.Parse(resp.Body)
    if err != nil {
        panic(err)
    }
    for _, entry := range allEntries {
        fmt.Printf("Version %s was uploaded on %s\n", entry.Version, entry.When)
    }
}
The output of which looks like:
Version 1.8.3~ds1-2 was uploaded on 2015-11-04 00:09:02 -0800 -0800
Version 1.8.3~ds1-1 was uploaded on 2015-10-29 19:40:51 -0700 -0700
Version 1.8.2~ds1-2 was uploaded on 2015-10-29 07:23:10 -0700 -0700
Version 1.8.2~ds1-1 was uploaded on 2015-10-28 14:21:00 -0700 -0700
Version 1.7.1~dfsg1-1 was uploaded on 2015-08-26 10:13:48 -0700 -0700
Version 1.6.2~dfsg1-2 was uploaded on 2015-07-01 07:45:19 -0600 -0600
Version 1.6.2~dfsg1-1 was uploaded on 2015-05-21 00:47:43 -0600 -0600
Version 1.6.1+dfsg1-2 was uploaded on 2015-05-10 13:02:54 -0400 EDT
Version 1.6.1+dfsg1-1 was uploaded on 2015-05-08 17:57:10 -0600 -0600
Version 1.6.0+dfsg1-1 was uploaded on 2015-05-05 15:10:49 -0600 -0600
Version 1.6.0+dfsg1-1~exp1 was uploaded on 2015-04-16 18:00:21 -0600 -0600
Version 1.6.0~rc7~dfsg1-1~exp1 was uploaded on 2015-04-15 19:35:46 -0600 -0600
Version 1.6.0~rc4~dfsg1-1 was uploaded on 2015-04-06 17:11:33 -0600 -0600
Version 1.5.0~dfsg1-1 was uploaded on 2015-03-10 22:58:49 -0600 -0600
Version 1.3.3~dfsg1-2 was uploaded on 2015-01-03 00:11:47 -0700 -0700
Version 1.3.3~dfsg1-1 was uploaded on 2014-12-18 21:54:12 -0700 -0700
Version 1.3.2~dfsg1-1 was uploaded on 2014-11-24 19:14:28 -0500 EST
Version 1.3.1~dfsg1-2 was uploaded on 2014-11-07 13:11:34 -0700 -0700
Version 1.3.1~dfsg1-1 was uploaded on 2014-11-03 08:26:29 -0700 -0700
Version 1.3.0~dfsg1-1 was uploaded on 2014-10-17 00:56:07 -0600 -0600
Version 1.2.0~dfsg1-2 was uploaded on 2014-10-09 00:08:11 +0000 +0000
Version 1.2.0~dfsg1-1 was uploaded on 2014-09-13 11:43:17 -0600 -0600
Version 1.0.0~dfsg1-1 was uploaded on 2014-06-13 21:04:53 -0400 EDT
Version 0.11.1~dfsg1-1 was uploaded on 2014-05-09 17:30:45 -0400 EDT
Version 0.9.1~dfsg1-2 was uploaded on 2014-04-08 23:19:08 -0400 EDT
Version 0.9.1~dfsg1-1 was uploaded on 2014-04-03 21:38:30 -0400 EDT
Version 0.9.0+dfsg1-1 was uploaded on 2014-03-11 22:24:31 -0400 EDT
Version 0.8.1+dfsg1-1 was uploaded on 2014-02-25 20:56:31 -0500 EST
Version 0.8.0+dfsg1-2 was uploaded on 2014-02-15 17:51:58 -0500 EST
Version 0.8.0+dfsg1-1 was uploaded on 2014-02-10 20:41:10 -0500 EST
Version 0.7.6+dfsg1-1 was uploaded on 2014-01-22 22:50:47 -0500 EST
Version 0.7.1+dfsg1-1 was uploaded on 2014-01-15 20:22:34 -0500 EST
Version 0.6.7+dfsg1-3 was uploaded on 2014-01-09 20:10:20 -0500 EST
Version 0.6.7+dfsg1-2 was uploaded on 2014-01-08 19:14:02 -0500 EST
Version 0.6.7+dfsg1-1 was uploaded on 2014-01-07 21:06:10 -0500 EST
control Next is one of the most complex, and one of the oldest, parts of go-debian: the control file parser (otherwise sometimes known as deb822). This module was inspired by the way that the json module works in Go, allowing for files to be defined in code with a struct. This tends to be a bit more declarative, but also winds up putting logic into struct tags, which can be a nasty anti-pattern if used too much. The first primitive in this module is the concept of a Paragraph, a struct containing two values: the order of keys seen, and a map of string to string. All higher order functions dealing with control files will go through this type, which is a helpful interchange format to be aware of. All parsing of meaning from the control file happens when the Paragraph is unpacked into a struct using reflection. The idea behind this strategy is that you define your struct and let the control parser handle unpacking the data from the IO into your container, letting you maintain type safety: since you never have to read and cast, the conversion handles this, and returns an unmarshaling error in the event of failure. Additionally, structs that define an anonymous member of control.Paragraph will have the raw Paragraph struct of the underlying file, allowing the programmer to handle dynamic tags (such as X-Foo), or at least letting them survive the round-trip through Go. The default decoder takes an argument enabling it to verify the input control file using an OpenPGP keyring, which is exposed to the programmer through the (*Decoder).Signer() function. If the passed argument is nil, it will not check the input file signature (at all!), and if it has been passed, any signed data must be found or an error will fall out of the NewDecoder call. On the way out, the opposite happens: the struct is introspected, turned into a control.Paragraph, and then written out to the io.Writer. Here's a quick (and VERY dirty) example showing the basics of reading and writing Debian control files with go-debian.
package main
import (
    "fmt"
    "io"
    "net/http"
    "strings"
    "pault.ag/go/debian/control"
)
type AllowedPackage struct {
    Package     string
    Fingerprint string
}

func (a *AllowedPackage) UnmarshalControl(in string) error {
    in = strings.TrimSpace(in)
    chunks := strings.SplitN(in, " ", 2)
    if len(chunks) != 2 {
        return fmt.Errorf("Syntax sucks: '%s'", in)
    }
    a.Package = chunks[0]
    a.Fingerprint = chunks[1][1 : len(chunks[1])-1]
    return nil
}

type DMUA struct {
    Fingerprint     string
    Uid             string
    AllowedPackages []AllowedPackage `control:"Allow" delim:","`
}

func main() {
    resp, err := http.Get("http://metadata.ftp-master.debian.org/dm.txt")
    if err != nil {
        panic(err)
    }
    decoder, err := control.NewDecoder(resp.Body, nil)
    if err != nil {
        panic(err)
    }
    for {
        dmua := DMUA{}
        if err := decoder.Decode(&dmua); err != nil {
            if err == io.EOF {
                break
            }
            panic(err)
        }
        fmt.Printf("The DM %s is allowed to upload:\n", dmua.Uid)
        for _, allowedPackage := range dmua.AllowedPackages {
            fmt.Printf("   %s [granted by %s]\n", allowedPackage.Package, allowedPackage.Fingerprint)
        }
    }
}
Output (truncated!) looks a bit like:
...
The DM Allison Randal <allison@lohutok.net> is allowed to upload:
   parrot [granted by A4F455C3414B10563FCC9244AFA51BD6CDE573CB]
...
The DM Benjamin Barenblat <bbaren@mit.edu> is allowed to upload:
   boogie [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
   dafny [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
   transmission-remote-gtk [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
   urweb [granted by 3224C4469D7DF8F3D6F41A02BBC756DDBE595F6B]
...
The DM     <aelmahmoudy@sabily.org> is allowed to upload:
   covered [granted by 41352A3B4726ACC590940097F0A98A4C4CD6E3D2]
   dico [granted by 6ADD5093AC6D1072C9129000B1CCD97290267086]
   drawtiming [granted by 41352A3B4726ACC590940097F0A98A4C4CD6E3D2]
   fonts-hosny-amiri [granted by BD838A2BAAF9E3408BD9646833BE1A0A8C2ED8FF]
   ...
...
deb Next up, we've got the deb module. This contains code to handle reading Debian 2.0 .deb files. It contains a wrapper that will parse the control member, and provide the data member through the archive/tar interface. Here's an example of how to read a .deb file, access some metadata, and iterate over the tar archive, printing the filename of each of the entries.
package main

import (
    "fmt"
    "io"
    "os"

    "pault.ag/go/debian/deb"
)

func main() {
    path := "/tmp/fluxbox_1.3.5-2+b1_amd64.deb"
    fd, err := os.Open(path)
    if err != nil {
        panic(err)
    }
    defer fd.Close()
    debFile, err := deb.Load(fd, path)
    if err != nil {
        panic(err)
    }
    version := debFile.Control.Version
    fmt.Printf(
        "Epoch: %d, Version: %s, Revision: %s\n",
        version.Epoch, version.Version, version.Revision,
    )
    for {
        hdr, err := debFile.Data.Next()
        if err == io.EOF {
            break
        }
        if err != nil {
            panic(err)
        }
        fmt.Printf("  -> %s\n", hdr.Name)
    }
}
Boringly, the output looks like:
Epoch: 0, Version: 1.3.5, Revision: 2+b1
  -> ./
  -> ./etc/
  -> ./etc/menu-methods/
  -> ./etc/menu-methods/fluxbox
  -> ./etc/X11/
  -> ./etc/X11/fluxbox/
  -> ./etc/X11/fluxbox/window.menu
  -> ./etc/X11/fluxbox/fluxbox.menu-user
  -> ./etc/X11/fluxbox/keys
  -> ./etc/X11/fluxbox/init
  -> ./etc/X11/fluxbox/system.fluxbox-menu
  -> ./etc/X11/fluxbox/overlay
  -> ./etc/X11/fluxbox/apps
  -> ./usr/
  -> ./usr/share/
  -> ./usr/share/man/
  -> ./usr/share/man/man5/
  -> ./usr/share/man/man5/fluxbox-style.5.gz
  -> ./usr/share/man/man5/fluxbox-menu.5.gz
  -> ./usr/share/man/man5/fluxbox-apps.5.gz
  -> ./usr/share/man/man5/fluxbox-keys.5.gz
  -> ./usr/share/man/man1/
  -> ./usr/share/man/man1/startfluxbox.1.gz
...
dependency The dependency package provides an interface to parse and compute dependencies. This package is a bit odd in that, well, there's no other library that does this. The issue is that there are actually two different parsers that compute our Dependency lines: one in Perl (as part of dpkg-dev) and another in C (in dpkg). To date, this has resulted in me filing three different bugs. I also found a broken package in the archive, which actually resulted in another bug being (totally accidentally) already fixed. I hope to continue to run the archive through my parser in hopes of finding more bugs! This package is a bit complex, but it basically just returns what amounts to an AST for our Dependency lines. I'm positive there are bugs, so file them!
package main

import (
    "fmt"

    "pault.ag/go/debian/dependency"
)

func main() {
    dep, err := dependency.Parse("foo | bar, baz, foobar [amd64] | bazfoo [!sparc], fnord:armhf [gnu-linux-sparc]")
    if err != nil {
        panic(err)
    }
    anySparc, err := dependency.ParseArch("sparc")
    if err != nil {
        panic(err)
    }
    for _, possi := range dep.GetPossibilities(*anySparc) {
        fmt.Printf("%s (%s)\n", possi.Name, possi.Arch)
    }
}
Gives the output:
foo (<nil>)
baz (<nil>)
fnord (armhf)
version Right off the bat, I'd like to thank Michael Stapelberg for letting me graft this out of dcs and into the go-debian package. This was nearly entirely his work (with a one or two line function I added later), and was amazingly helpful to have. Thank you! This module implements Debian version comparisons and parsing, allowing for sorting in lists, checking to see if a version is native or not, and letting the programmer implement smart(er!) logic based on upstream (or Debian) version numbers. This module is extremely easy to use and very straightforward, and hardly worth writing an example for (though a tiny sketch follows at the end of this post). Final thoughts This is more of a "Yeah, OK, this has been useful enough to me at this point that I'm going to support this" rather than an "It's stable!" or even "It's alive!" post. Hopefully folks can report bugs and help iterate on this module until we have some really clean building blocks to build solid higher level systems on top of. Being able to have multiple libraries interoperate by relying on go-debian will make things massively easier. I'm in need of more documentation, and need to finalize some parts of the older sub-package APIs, but I'm hoping to be at a "1.0" real soon now.
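As promised above, here is a tiny sketch of what using the version module might look like. This is a sketch only; the Parse and Compare names are assumptions drawn from the module's dcs lineage rather than verified API, so check the godoc before relying on them.

package main

import (
    "fmt"

    "pault.ag/go/debian/version"
)

func main() {
    // Parse/Compare are assumed names; verify against the actual API.
    a, err := version.Parse("1.0-1")
    if err != nil {
        panic(err)
    }
    b, err := version.Parse("1.0-1+b1")
    if err != nil {
        panic(err)
    }
    // Compare returns <0, 0 or >0, in the manner of dpkg --compare-versions.
    if version.Compare(a, b) < 0 {
        fmt.Println("1.0-1 sorts before 1.0-1+b1")
    }
}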

12 October 2015

Michael Stapelberg: ratt: Rebuild All The Things!

When uploading a new library package which changes its API/behavior in a subtle way, typically you will only hear about the downstream breakage after you've uploaded the new library package (via bug reports telling you that your package FTBFS, i.e. fails to build from source). I prefer quality assurance to happen proactively rather than reactively whenever possible, so I set out to write a tool which automates the rebuild of reverse-build-dependencies, i.e. all packages whose build process could be affected by the package one is about to upload. The result is a tool called ratt, which is short for "Rebuild All The Things!". It injects the newly-built Debian package you provide it, figures out all the reverse-build-dependencies, invokes sbuild for all of them, and finally presents you with a list of packages that failed to build. To demonstrate how the tool works, let's assume we want to upload a new version of the Go library github.com/jacobsa/gcloud. To keep this example brief, we don't actually do anything to the package but bump its version number:
midna /tmp $ debcheckout golang-github-jacobsa-gcloud-dev
declared git repository at git://anonscm.debian.org/pkg-go/packages/golang-github-jacobsa-gcloud.git
git clone git://anonscm.debian.org/pkg-go/packages/golang-github-jacobsa-gcloud.git golang-github-jacobsa-gcloud-dev ...
Cloning into 'golang-github-jacobsa-gcloud-dev'...
remote: Counting objects: 84, done.
remote: Compressing objects: 100% (73/73), done.
remote: Total 84 (delta 19), reused 0 (delta 0)
Receiving objects: 100% (84/84), 87.70 KiB   0 bytes/s, done.
Resolving deltas: 100% (19/19), done.
Checking connectivity... done.
midna /tmp $ cd golang-github-jacobsa-gcloud-dev
midna /tmp/golang-github-jacobsa-gcloud-dev $ dch -i -m 'dummy new version'
midna /tmp/golang-github-jacobsa-gcloud-dev $ git commit -a -m 'dummy new version'
midna /tmp/golang-github-jacobsa-gcloud-dev $ gbp buildpackage --git-pbuilder
[ ]
So far, so good. Now let's invoke ratt to rebuild all reverse-build-dependencies:
midna /tmp/golang-github-jacobsa-gcloud-dev $ ratt golang-github-jacobsa-gcloud_0.0\~git20150709-2_amd64.changes         
2015/08/16 11:48:41 Loading changes file "golang-github-jacobsa-gcloud_0.0~git20150709-2_amd64.changes"
2015/08/16 11:48:41  - 1 binary packages: golang-github-jacobsa-gcloud-dev
2015/08/16 11:48:41  - corresponding .debs (will be injected when building):
2015/08/16 11:48:41     golang-github-jacobsa-gcloud-dev_0.0~git20150709-2_all.deb
2015/08/16 11:48:41 Loading sources index "/var/lib/apt/lists/ftp.ch.debian.org_debian_dists_sid_contrib_source_Sources"
2015/08/16 11:48:41 Loading sources index "/var/lib/apt/lists/ftp.ch.debian.org_debian_dists_sid_main_source_Sources"
2015/08/16 11:48:43 Loading sources index "/var/lib/apt/lists/ftp.ch.debian.org_debian_dists_sid_non-free_source_Sources"
2015/08/16 11:48:43 Building golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1 (commandline: [sbuild --arch-all --dist=sid --nolog golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1 --extra-package=golang-github-jacobsa-gcloud-dev_0.0~git20150709-2_all.deb])
2015/08/16 11:49:19 Build results:
2015/08/16 11:49:19 PASSED: golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1
Thanks for reading this far, and I hope ratt makes your life a tiny bit easier. As ratt just entered Debian unstable, you can install it using apt-get install ratt. If you have any feedback, I'm eager to hear it.
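As a footnote, the core of what ratt does, scanning a Sources index for packages whose Build-Depends mention the package being uploaded, can be sketched with the go-debian control decoder shown elsewhere on this page. This is my illustration, not ratt's code; the index path and target package name are placeholders, and the substring match is deliberately crude where ratt parses dependency lines properly.

package main

import (
    "fmt"
    "io"
    "os"
    "strings"

    "pault.ag/go/debian/control"
)

// Source captures just the fields we need from a Sources index paragraph.
type Source struct {
    Package      string
    BuildDepends string `control:"Build-Depends"`
}

func main() {
    fd, err := os.Open("/var/lib/apt/lists/ftp.ch.debian.org_debian_dists_sid_main_source_Sources")
    if err != nil {
        panic(err)
    }
    defer fd.Close()
    decoder, err := control.NewDecoder(fd, nil)
    if err != nil {
        panic(err)
    }
    for {
        var src Source
        if err := decoder.Decode(&src); err != nil {
            if err == io.EOF {
                break
            }
            panic(err)
        }
        // Crude substring match on the raw Build-Depends line.
        if strings.Contains(src.BuildDepends, "golang-github-jacobsa-gcloud-dev") {
            fmt.Println(src.Package)
        }
    }
}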

9 August 2015

Simon Kainz: DUCK challenge: week 5

Slightly delayed, but here are the stats for week 5 of the DUCK challenge: So we had 10 packages fixed and uploaded by 10 different uploaders. A big "Thank You" to you!! Since the start of this challenge, a total of 59 packages were fixed. Here is a quick overview:
            Week 1  Week 2  Week 3  Week 4  Week 5  Week 6  Week 7
# Packages      10      15      10      14      10       -       -
Total           10      25      35      49      59       -       -
The list of the fixed and updated packages is available here. I will try to update this ~daily. If I missed one of your uploads, please drop me a line. Only 2 more weeks to DebConf15, so please get involved: the DUCK challenge is running until the end of DebConf15! Previous articles are here: Week 1, Week 2, Week 3, Week 4.

27 July 2015

Michael Stapelberg: dh-make-golang: creating Debian packages from Go packages

Recently, the pkg-go team has been quite busy, uploading dozens of Go library packages in order to be able to package gcsfuse (a user-space file system for interacting with Google Cloud Storage) and InfluxDB (an open-source distributed time series database). Packaging Go library packages (!) is a fairly repetitive process, so before starting my work on the dependencies for gcsfuse, I started writing a tool called dh-make-golang. Just like dh-make itself, the goal is to automatically create (almost) an entire Debian package. As I worked my way through the dependencies of gcsfuse, I refined how the tool works, and now I believe it's good enough for a first release. To demonstrate how the tool works, let's assume we want to package the Go library github.com/jacobsa/ratelimit:
midna /tmp $ dh-make-golang github.com/jacobsa/ratelimit
2015/07/25 18:25:39 Downloading "github.com/jacobsa/ratelimit/..."
2015/07/25 18:25:53 Determining upstream version number
2015/07/25 18:25:53 Package version is "0.0~git20150723.0.2ca5e0c"
2015/07/25 18:25:53 Determining dependencies
2015/07/25 18:25:55 
2015/07/25 18:25:55 Packaging successfully created in /tmp/golang-github-jacobsa-ratelimit
2015/07/25 18:25:55 
2015/07/25 18:25:55 Resolve all TODOs in itp-golang-github-jacobsa-ratelimit.txt, then email it out:
2015/07/25 18:25:55     sendmail -t -f < itp-golang-github-jacobsa-ratelimit.txt
2015/07/25 18:25:55 
2015/07/25 18:25:55 Resolve all the TODOs in debian/, find them using:
2015/07/25 18:25:55     grep -r TODO debian
2015/07/25 18:25:55 
2015/07/25 18:25:55 To build the package, commit the packaging and use gbp buildpackage:
2015/07/25 18:25:55     git add debian && git commit -a -m 'Initial packaging'
2015/07/25 18:25:55     gbp buildpackage --git-pbuilder
2015/07/25 18:25:55 
2015/07/25 18:25:55 To create the packaging git repository on alioth, use:
2015/07/25 18:25:55     ssh git.debian.org "/git/pkg-go/setup-repository golang-github-jacobsa-ratelimit 'Packaging for golang-github-jacobsa-ratelimit'"
2015/07/25 18:25:55 
2015/07/25 18:25:55 Once you are happy with your packaging, push it to alioth using:
2015/07/25 18:25:55     git push git+ssh://git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git --tags master pristine-tar upstream
The ITP is often the most labor-intensive part of the packaging process, because any number of auto-detected values might be wrong: the repository owner might not be the "Upstream Author", the repository might not have a short description, the long description might need some adjustments, or the license might not be auto-detected.
midna /tmp $ cat itp-golang-github-jacobsa-ratelimit.txt
From: "Michael Stapelberg" <stapelberg AT debian.org>
To: submit@bugs.debian.org
Subject: ITP: golang-github-jacobsa-ratelimit -- Go package for rate limiting
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Package: wnpp
Severity: wishlist
Owner: Michael Stapelberg <stapelberg AT debian.org>
* Package name    : golang-github-jacobsa-ratelimit
  Version         : 0.0~git20150723.0.2ca5e0c-1
  Upstream Author : Aaron Jacobs
* URL             : https://github.com/jacobsa/ratelimit
* License         : Apache-2.0
  Programming Lang: Go
  Description     : Go package for rate limiting
 GoDoc (https://godoc.org/github.com/jacobsa/ratelimit)
 .
 This package contains code for dealing with rate limiting. See the
 reference (http://godoc.org/github.com/jacobsa/ratelimit) for more info.
TODO: perhaps reasoning
midna /tmp $
After filling in all the TODOs in the file, let's mail it out and get a sense of what else still needs to be done:
midna /tmp $ sendmail -t -f < itp-golang-github-jacobsa-ratelimit.txt
midna /tmp $ cd golang-github-jacobsa-ratelimit
midna /tmp/golang-github-jacobsa-ratelimit master $ grep -r TODO debian
debian/changelog:  * Initial release (Closes: TODO) 
midna /tmp/golang-github-jacobsa-ratelimit master $
After filling in these TODOs as well, let's have a final look at what we're about to build:
midna /tmp/golang-github-jacobsa-ratelimit master $ head -100 debian/**/*
==> debian/changelog <==                            
golang-github-jacobsa-ratelimit (0.0~git20150723.0.2ca5e0c-1) unstable; urgency=medium
  * Initial release (Closes: #793646)
 -- Michael Stapelberg <stapelberg@debian.org>  Sat, 25 Jul 2015 23:26:34 +0200
==> debian/compat <==
9
==> debian/control <==
Source: golang-github-jacobsa-ratelimit
Section: devel
Priority: extra
Maintainer: pkg-go <pkg-go-maintainers@lists.alioth.debian.org>
Uploaders: Michael Stapelberg <stapelberg@debian.org>
Build-Depends: debhelper (>= 9),
               dh-golang,
               golang-go,
               golang-github-jacobsa-gcloud-dev,
               golang-github-jacobsa-oglematchers-dev,
               golang-github-jacobsa-ogletest-dev,
               golang-github-jacobsa-syncutil-dev,
               golang-golang-x-net-dev
Standards-Version: 3.9.6
Homepage: https://github.com/jacobsa/ratelimit
Vcs-Browser: http://anonscm.debian.org/gitweb/?p=pkg-go/packages/golang-github-jacobsa-ratelimit.git;a=summary
Vcs-Git: git://anonscm.debian.org/pkg-go/packages/golang-github-jacobsa-ratelimit.git
Package: golang-github-jacobsa-ratelimit-dev
Architecture: all
Depends: ${shlibs:Depends},
         ${misc:Depends},
         golang-go,
         golang-github-jacobsa-gcloud-dev,
         golang-github-jacobsa-oglematchers-dev,
         golang-github-jacobsa-ogletest-dev,
         golang-github-jacobsa-syncutil-dev,
         golang-golang-x-net-dev
Built-Using: ${misc:Built-Using}
Description: Go package for rate limiting
 This package contains code for dealing with rate limiting. See the
 reference (http://godoc.org/github.com/jacobsa/ratelimit) for more info.
==> debian/copyright <==
Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: ratelimit
Source: https://github.com/jacobsa/ratelimit
Files: *
Copyright: 2015 Aaron Jacobs
License: Apache-2.0
Files: debian/*
Copyright: 2015 Michael Stapelberg <stapelberg@debian.org>
License: Apache-2.0
Comment: Debian packaging is licensed under the same terms as upstream
License: Apache-2.0
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
 .
 http://www.apache.org/licenses/LICENSE-2.0
 .
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 .
 On Debian systems, the complete text of the Apache version 2.0 license
 can be found in "/usr/share/common-licenses/Apache-2.0".
==> debian/gbp.conf <==
[DEFAULT]
pristine-tar = True
==> debian/rules <==
#!/usr/bin/make -f
export DH_GOPKG := github.com/jacobsa/ratelimit
%:
	dh $@ --buildsystem=golang --with=golang
==> debian/source <==
head: error reading 'debian/source': Is a directory
==> debian/source/format <==
3.0 (quilt)
midna /tmp/golang-github-jacobsa-ratelimit master $
Okay, then. Let's give it a shot and see if it builds:
midna /tmp/golang-github-jacobsa-ratelimit master $ git add debian && git commit -a -m 'Initial packaging'
[master 48f4c25] Initial packaging                                                      
 7 files changed, 75 insertions(+)
 create mode 100644 debian/changelog
 create mode 100644 debian/compat
 create mode 100644 debian/control
 create mode 100644 debian/copyright
 create mode 100644 debian/gbp.conf
 create mode 100755 debian/rules
 create mode 100644 debian/source/format
midna /tmp/golang-github-jacobsa-ratelimit master $ gbp buildpackage --git-pbuilder
[ ]
midna /tmp/golang-github-jacobsa-ratelimit master $ lintian ../golang-github-jacobsa-ratelimit_0.0\~git20150723.0.2ca5e0c-1_amd64.changes
I: golang-github-jacobsa-ratelimit source: debian-watch-file-is-missing
P: golang-github-jacobsa-ratelimit-dev: no-upstream-changelog
I: golang-github-jacobsa-ratelimit-dev: extended-description-is-probably-too-short
midna /tmp/golang-github-jacobsa-ratelimit master $
This package just built (as it should!), but occasionally one might need to disable a test and file an upstream bug about it. So, let's push this package to pkg-go and upload it:
midna /tmp/golang-github-jacobsa-ratelimit master $ ssh git.debian.org "/git/pkg-go/setup-repository golang-github-jacobsa-ratelimit 'Packaging for golang-github-jacobsa-ratelimit'"
Initialized empty shared Git repository in /srv/git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git/
HEAD is now at ea6b1c5 add mrconfig for dh-make-golang
[master c5be5a1] add mrconfig for golang-github-jacobsa-ratelimit
 1 file changed, 3 insertions(+)
To /git/pkg-go/meta.git
   ea6b1c5..c5be5a1  master -> master
midna /tmp/golang-github-jacobsa-ratelimit master $ git push git+ssh://git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git --tags master pristine-tar upstream
Counting objects: 31, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (25/25), done.
Writing objects: 100% (31/31), 18.38 KiB   0 bytes/s, done.
Total 31 (delta 2), reused 0 (delta 0)
To git+ssh://git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git
 * [new branch]      master -> master
 * [new branch]      pristine-tar -> pristine-tar
 * [new branch]      upstream -> upstream
 * [new tag]         upstream/0.0_git20150723.0.2ca5e0c -> upstream/0.0_git20150723.0.2ca5e0c
midna /tmp/golang-github-jacobsa-ratelimit master $ cd ..
midna /tmp $ debsign golang-github-jacobsa-ratelimit_0.0\~git20150723.0.2ca5e0c-1_amd64.changes
[ ]
midna /tmp $ dput golang-github-jacobsa-ratelimit_0.0\~git20150723.0.2ca5e0c-1_amd64.changes   
Uploading golang-github-jacobsa-ratelimit using ftp to ftp-master (host: ftp.upload.debian.org; directory: /pub/UploadQueue/)
[ ]
Uploading golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1.dsc
Uploading golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c.orig.tar.bz2
Uploading golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1.debian.tar.xz
Uploading golang-github-jacobsa-ratelimit-dev_0.0~git20150723.0.2ca5e0c-1_all.deb
Uploading golang-github-jacobsa-ratelimit_0.0~git20150723.0.2ca5e0c-1_amd64.changes
midna /tmp $ cd golang-github-jacobsa-ratelimit 
midna /tmp/golang-github-jacobsa-ratelimit master $ git tag debian/0.0_git20150723.0.2ca5e0c-1
midna /tmp/golang-github-jacobsa-ratelimit master $ git push git+ssh://git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git --tags master pristine-tar upstream
Total 0 (delta 0), reused 0 (delta 0)
To git+ssh://git.debian.org/git/pkg-go/packages/golang-github-jacobsa-ratelimit.git
 * [new tag]         debian/0.0_git20150723.0.2ca5e0c-1 -> debian/0.0_git20150723.0.2ca5e0c-1
midna /tmp/golang-github-jacobsa-ratelimit master $
Thanks for reading this far, and I hope dh-make-golang makes your life a tiny bit easier. As dh-make-golang just entered Debian unstable, you can install it using apt-get install dh-make-golang. If you have any feedback, I'm eager to hear it.

23 December 2014

Michael Stapelberg: Debian Code Search: taming the latency tail

It's been a couple of weeks since I launched Debian Code Search Instant, so people have had the chance to use it for a while, and that gives me plenty of data points to look at :-). For every query, I log the search term itself as well as the duration the query took to execute. That way, I can easily identify queries that take a long time and see why that is. There is a class of queries for which Debian Code Search (DCS) doesn't perform so well, and that's queries that consist of trigrams which are extremely common. Whenever DCS receives such a query, it needs to search through a lot of files. Note that it doesn't really matter if there are plenty of results or not: it's the number of files that potentially contain a result which matters. One such query is "arse" (we get a lot of curse words). It consists of only two trigrams ("ars" and "rse"), which are extremely common in program source code. As a couple of examples, the terms "parse", "sparse", "charset" and "coarse" are all matched by that. As an aside, if you really want to search for just "arse", use word boundaries, i.e. "\barse\b", which also makes the query significantly faster. Fixing the overloaded frontend When DCS first received the query, "arse" would lead to our frontend server crashing. That was due to (intentionally) unoptimized code: we were aggregating all search results from all 6 source backends in memory, sorting them, and then writing them out to disk. I addressed this in commit d2922fe92 with the following measures:
  1. Instead of keeping the entire result in memory, just write the result to a temporary file on disk ("unsorted.json") and store pointers into that file in memory, i.e. (offset, length) tuples. In order to sort the results, we also need to store the ranking and the path (to resolve ties and thereby guarantee a stable result order over multiple search queries). For grouping the results by source package, we need to keep the package name.
  2. If you think about it, you don't need the entire path in order to break a tie: the hash is enough, as it defines an ordering. That ordering may be different, but any ordering is good enough for the purpose of merely breaking a tie in a deterministic way. I'm using Go's hash/fnv, the only non-cryptographic (fast!) hash function that is included in Go's standard library.
  3. Since this was still leading to Out Of Memory errors, I decided to not store a copy of the package name in each search result, but rather use a pointer into a string pool containing all package names. The number of source package names is relatively small, on the order of 20,000, whereas the number of search results can be in the millions. Using the string pool is a clear win: the overhead in the case where #results < #srcpackages is negligible, but as soon as #results > #srcpackages, you save memory.
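To illustrate, here is a minimal sketch of such a string pool and the FNV tie-breaker. The names (stringPool, hashPath) and the simplified result struct are made up for this example; they are not the actual DCS types:

package main

import (
	"fmt"
	"hash/fnv"
)

// stringPool canonicalizes strings: equal package names share one
// backing string, so each result stores only a pointer into the pool.
type stringPool struct {
	m map[string]*string
}

func newStringPool() *stringPool {
	return &stringPool{m: make(map[string]*string)}
}

func (p *stringPool) Get(s string) *string {
	if canonical, ok := p.m[s]; ok {
		return canonical
	}
	p.m[s] = &s
	return &s
}

// hashPath reduces a path to a fixed-size value that is good enough
// to break ranking ties deterministically.
func hashPath(path string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(path))
	return h.Sum64()
}

// result is a simplified stand-in for what the frontend keeps in memory:
// a pointer into unsorted.json plus the fields needed for sorting/grouping.
type result struct {
	offset, length int64
	ranking        float32
	pathHash       uint64
	pkg            *string
}

func main() {
	pool := newStringPool()
	r1 := result{pkg: pool.Get("coreutils"), pathHash: hashPath("src/ls.c")}
	r2 := result{pkg: pool.Get("coreutils")}
	fmt.Println(r1.pkg == r2.pkg) // true: both share one pool entry
}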
With all of that fixed, the query became possible at all, albeit with a runtime of around 20 minutes.

Double Writing

When running such a long-running query, I noticed that the query ran smoothly for a while, but then it took multiple seconds without any visible progress at the end of the query before the results appeared. This was due to the frontend ranking the results and then converting unsorted.json into actual result pages. Since we provide results ordered by ranking, but also results grouped by source packages, it was writing every result twice to disk. What's even worse is that due to re-ordering, every read was essentially random (as opposed to sequential reads). Worse still, nobody will ever click through all the hundreds of thousands of result pages, so they are prepared entirely in vain. Therefore, with commit c744b236e I made the frontend generate these result pages on demand. This cut down the time for the ranking phase at the end of each query from 20-30 seconds (for big queries) to typically less than one second.

Profiling/Monitoring

After adding monitoring to each of the source-backends, I realized that during these long-running queries, the disk I/O and network I/O were nowhere near my expectations: each source-backend was sending only a low single-digit number of megabytes per second back to the frontend (typically somewhere between 1 MB/s and 3 MB/s). This didn't match up at all with the bandwidth I had observed in earlier performance tests, so I used wget -O /dev/null to send a query and discard the result in order to get some theoretical performance numbers. Suddenly, I was getting more than 10 MB/s worth of results, maxing out the disks with a read rate of about 200 MB/s. So where is the bottleneck? I double-checked that neither the CPU on any of our VMs, nor the network between them, was saturated. Note that as of this point, the CPU of the frontend was at roughly 70% (of one core), which didn't seem like a lot to me. Then, I followed this excellent tutorial on profiling Go programs to see where the frontend was spending its time. It turns out the biggest consumer of CPU time was the encoding/json Go package, which is used for deserializing results received from the backend and serializing them again before sending them to the client. Since I had been curious about it for a while already, I decided to give Cap'n Proto a try to replace JSON as the serialization mechanism for communication between the source backends and the frontend. Switching to it (commit 8efd3b41) brought down the CPU load immensely, and made the query a bit faster. In addition, I killed the next biggest consumer: the lseek(2) syscall, which we used to call with SEEK_CUR and an offset of 0 so that it would tell us the current position. This was necessary in the first place because we don't know in advance how many bytes we're going to write when serializing a result to disk. The replacement is a neat little trick:
type countingWriter int64

// Write only counts the bytes passing through; it never touches the disk.
func (c *countingWriter) Write(p []byte) (n int, err error) {
	*c += countingWriter(len(p))
	return len(p), nil
}
// […]
// Then, use an io.MultiWriter like this:
var written countingWriter
w := io.MultiWriter(s.tempFiles[backendidx], &written)
result.WriteJSON(w)
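Because the io.MultiWriter feeds every byte to both the temporary file and the counter, the value of written always matches the number of bytes serialized so far, without a single extra syscall.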
With some more profiling, the new bottleneck was suddenly the read(2) syscall, issued by the Cap'n Proto deserialization, operating directly on the network connection buffer. strace revealed that while crunching through the results of one source backend for a long query, read(2) was called about 250,000 times. By simply using a buffered reader (commit 684467ae), I could reduce that to about 2,000 times.

Another bottleneck was the fact that for every processed result, the frontend needed to update the query state, which is shared amongst all goroutines (there is one goroutine for each source backend). All that parallelism isn't very effective if you need to synchronize the state updates in the end. So with commit 5d46a572, I refactored the state to be per-backend, so that locking is only necessary for the first couple of results, and the vast, vast majority of results can be processed entirely without locking.

This brought down the query time from 20 minutes to about 5 minutes, but I still wasn't happy with the bandwidth: the frontend was doing a bit over 10 MB/s of reads from all source backends combined, whereas with wget I could get around 40 MB/s for the same query. At this point, the CPU utilization was around 7% of one core on the frontend, and profiling didn't immediately reveal an obvious culprit. After a bit of experimenting (by commenting out code here and there ;-)), I figured out that the problem was that the frontend was still converting these results from capnproto buffers to JSON. While that doesn't take a lot of CPU time, it delays the network stream from the source-backend: once the local and remote TCP buffers are full, the source-backend will (intentionally!) not continue with its search, so that it doesn't run out of memory. I'm still convinced that's a good idea, and in fact I was able to solve the problem in an elegant way: instead of writing JSON to disk and generating result pages on demand, we now write capnproto directly to disk (commit 466b7f3e) and convert it to JSON only before sending out the result pages. That decreases the overall CPU time, since we only need to convert a small fraction of the results to JSON, but most importantly, the frontend is not in the critical path anymore. It can directly pass the data through, and in fact it uses an io.TeeReader to do exactly that.

Conclusion

With all of these optimizations, we're now down to about 2.5 minutes for the search query "arse", and the architecture of the system actually got simpler to reason about. Most importantly, though, the optimizations don't only play out for a single query, but for many different queries. I deployed the optimized version on the 15th of December 2014, and you can see that the 99th, 95th and 90th percentile latency dropped significantly, i.e. there are a lot fewer spikes than before, and more queries are processed faster, which is particularly obvious in the third graph (which is capped at 2 minutes). For the curious, a small sketch of the two I/O fixes follows below.
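This is a hypothetical sketch, not the actual DCS source: the address, file name and buffer size are made up. It shows a bufio.Reader batching the read(2) calls on the backend connection, and an io.TeeReader passing the raw capnproto stream straight through to disk while the decoder consumes it:

package main

import (
	"bufio"
	"io"
	"log"
	"net"
	"os"
)

func main() {
	// Placeholder address; the real backends are wired up differently.
	conn, err := net.Dial("tcp", "localhost:28081")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// A large buffer turns hundreds of thousands of small read(2) calls
	// on the raw connection into a few thousand big ones.
	br := bufio.NewReaderSize(conn, 1<<20)

	f, err := os.Create("results.capnproto")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Everything the decoder reads from r is also written to f, so the
	// raw stream reaches disk without the frontend re-encoding it.
	r := io.TeeReader(br, f)

	// A real implementation would hand r to the capnproto decoder;
	// io.Copy stands in for that here.
	if _, err := io.Copy(io.Discard, r); err != nil {
		log.Fatal(err)
	}
}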


3 December 2014

Michael Stapelberg: Announcing the new Debian Code Search Instant

For the last few months, I have been working on a new version of Debian Code Search, and today it's going live! I call it Debian Code Search Instant, for multiple reasons; see below.

A lot faster

The new Debian Code Search is still hosted by Rackspace (thank you!), but our new architecture finally allows us to use their Performance Cloud Servers hardware generation. The old code search architecture was split across 5 different servers; however, the search itself was only performed on a single machine with a network-attached block volume backed by SSD. The new Code Search spreads out both the trigram index and the source code onto 6 different servers, and thanks to the new hardware generation, we have a locally connected SSD drive in each of them. So, we now get more than 6 times the IOPS we had before, at much lower latency :).

Grouping by Debian source package

The first feature request we ever got for Code Search was to group results by Debian source package. I've tried tackling that before, but it's a pretty hard problem. With the new architecture and capacity, this finally becomes feasible. After your query completes, there is a checkbox called "Group search results by Debian source package". Enable it, and you'll get neatly grouped results. For each package, there is a link at the bottom to refine the search to only consider that package. In case you are more interested in the full list of packages, for example because you are doing large-scale changes across Debian, that's also possible: click on the symbol right next to the "Filter by package" list, and a curl command will be revealed with which you can download the full list of packages.

Expensive queries (with lots of results) now possible

Previously, we had a 60 second timeout within which queries had to complete. This timeout has been completely lifted, and long-running queries with tons and tons of results are now possible. We kindly ask you not to abuse this feature: it's not very exciting to play with complexity explosion in regular expression engines by preventing others from doing their work. If you're interested in that sort of analysis, go grab the source code and play with it on your own machine :).

Instant results

While your query is running, you will almost immediately see the top 10 results, even though the search may still take a while. This is useful to figure out whether your regular expression matches what you thought it would, before having to wait until your 5 minute query is finished. Also, the new progress bar tells you how far the system is with your search.

Instant indexing

Previously, Code Search was deployed to a new set of machines from scratch every week in order to update the index. This was necessary because performance would severely degrade while building the index, so we were temporarily running with twice the number of machines until the new version was up. In the new architecture, we store an index for each source package and then merge these into one big index shard. This currently takes about 4 minutes with the code I wrote, but I'm sure it can be made even faster if necessary. So, whenever new packages are uploaded to the Debian archive, we can just index the new version and trigger a merge. We get notifications about new package uploads from FedMsg. Packages that are not seen on FedMsg for some reason are backfilled every hour.
The time between uploading a package and being able to find it in Debian Code Search therefore now ranges from a couple of minutes to about an hour, instead of about a week!

New, beautiful User Interface

Since we needed to rewrite the user interface anyway thanks to the new architecture, we also spent some effort on making it modern, polished and beautiful. Smooth CSS animations make you aware of what's going on, and the search results look cleaner than ever.

Conclusion

At least to me, the new Debian Code Search seems like a significant improvement over the old one. I hope you enjoy the new features, into which I put a lot of work. In case you have any feedback, I'm happy to hear it (see the contact link at the bottom of every Code Search site).

9 November 2014

Michael Stapelberg: Upcoming Debian Code Search features

I've been working on a significant rearchitecture of Debian Code Search during the last few months (off and on, as time permits). This will enable us to provide a couple of features that have been often requested but were not possible with the old architecture, such as grouping search results by Debian source package (#1 feature request) and performing queries that take longer than the default execution limit of a minute (#2 feature request). Oh, and it will also be a lot faster :-). All of this (unfortunately?) also requires a redesign of the user interface, as the old way of displaying results is just fundamentally incompatible with the new architecture. This is why it's taking longer than I'd like (never redesign both backend and frontend at the same time), but I'm implementing various nice improvements along the way and I think the result will be worth the wait. Lea has helped me a lot with re-designing the visual appearance of the search results page, and the new look (of the upcoming group-by-package feature!) is what I want to share in this blog post. If you have any comments, please let me know (via email to my debian address, stapelberg@). Enjoy:

11 April 2014

Ian Campbell: qcontrol 0.5.3

Update: Closely followed by 0.5.4 to fix an embarrassing brown paper bag bug: get it from gitorious or http://www.hellion.org.uk/qcontrol/releases/0.5.4/. I've just released qcontrol 0.5.3. Changes since the last release: Get it from gitorious or http://www.hellion.org.uk/qcontrol/releases/0.5.3/. The Debian package will be uploaded shortly.

19 February 2014

Michael Stapelberg: Using your TPM for SSH authentication

Thomas Habets blogged about using your TPM (Trusted Platform Module) for SSH authentication a few weeks ago. We worked together to get his package simple-tpm-pk11 into Debian, and it has just arrived in unstable :-). Using simple-tpm-pk11, you can let your TPM generate a key, which you can then use for SSH authentication. This key will never leave the TPM, so it is safer than having your key on the filesystem (e.g. ~/.ssh/id_rsa): file system access is no longer enough to steal your key; instead, you'll need remote code execution. To use this software, first make sure your TPM is enabled in the BIOS. In my ThinkPad X200 from 2008, the TPM is called "Security Chip". Afterwards, claim ownership of your TPM using tpm_takeownership -z (from the tpm-tools package) and enter a password. You will not need to enter this password for every SSH authentication later (but you may choose to set a separate password for that). Then, install simple-tpm-pk11, create a key, set it as your PKCS11Provider and install the public key on the host(s) where you want to use it:
mkdir ~/.simple-tpm-pk11
stpm-keygen -o ~/.simple-tpm-pk11/my.key
echo key my.key > ~/.simple-tpm-pk11/config
echo -e "\nHost *\n    PKCS11Provider libsimple-tpm-pk11.so" >> ~/.ssh/config
ssh-keygen -D libsimple-tpm-pk11.so | ssh shell.example.com tee -a .ssh/authorized_keys
You'll now be able to ssh into shell.example.com without having the key for that on your file system :-). In case you have any feedback about, or trouble with, the software, please feel free to contact Thomas directly.

17 January 2014

Michael Stapelberg: debmirror 6x speed-up, or how to fill a 1 GBps Rackspace link

As explained in more detail in my last blog post, Rackspace is providing hosting for Debian Code Search. For those of you who don't know, Rackspace is a cloud company that provides (among other services) a public cloud based on OpenStack. That means you can easily (and programmatically, if you want) bring up virtual servers, block storage volumes, configure the network between them, etc. As part of my initial performance experiments, I was running debmirror to clone a full Debian source mirror, and I noticed this took about one hour. Given that the peak network and storage write rates I have observed in my benchmarks are much higher, I wondered why it took so long. About 43 GB (that's how big the Debian sid sources are) in 60 minutes means a 12 MB/s download rate, which is a bit more than a sustained 100 MBit/s connection. Luckily, at least the Rackspace servers I looked at are connected with 1 GBit/s to the internet, so more should be possible. Note that these are the old Rackspace servers. There is a new generation of servers, which I have not yet tried, that apparently offer even higher performance. After a brief look at the debmirror source code, I concluded that it can only use a single mirror and downloads files sequentially. There is some obvious potential for improvement here, and the fact that I could come up with a proof of concept written in Go to determine the files to download in a couple of minutes encouraged me to spend my Saturday on this problem :-). About 7 hours later, my prototype had gone through various iterations and the code could sustain about 115 MB/s of incoming bandwidth most of the time. Here is a screenshot of dstat measuring the performance: Another 3 hours later, the most obvious bugs were weeded out and the code successfully cloned an entire mirror for the first time. I verified the correctness by running debmirror afterwards, and no files were downloaded that had not actually arrived on the mirror in the meantime.

Features
  1. Parallel downloading from multiple mirrors.
    Of course, it is not guaranteed that the mirrors are consistent, but that is a solvable problem: whenever a non-200 HTTP response is encountered, the file is put back into the queue and gets rescheduled onto any mirror. Eventually, it will be rescheduled onto the mirror from which the sources.gz was downloaded, and that mirror has to have it.
  2. HTTP keep-alive and pipelining.
    debmirror also does HTTP keep-alive (I read that in the source, did not verify it), but it downloads files sequentially, i.e. it downloads one file, then sends the request for the next file. This means that after one file has been received, no data is received until another round-trip has finished. For a lot of small files (think of .dsc and .debian.tar.gz files or small orig tarballs) that really adds up.
    Therefore, I use HTTP pipelining: I send 99 requests and then just receive all the responses at full speed. The magic number 99 comes from the fact that this is the smallest common denominator that all mirrors I have tested allow. nginx by default allows 100 requests. (A minimal sketch of the pipelining follows after this list.)
  3. Request ordering.
    Instead of sending requests at random, I sort the download queue by size and make sure that the first request on each connection is for a big file. The intention is that TCP will adapt to that quickly and use a bigger window size as soon as possible, instead of ramping up more slowly later on. Note that I have not actually measured the window sizes, so this is just a hunch.
    Another nice side-effect of the ordering is that the amount of data that is sent through each connection is roughly equal. This means we don't waste TCP connections by sending 99 requests for small files only.
  4. (Goroutines).
    I'm a bit hesitant to even write about it, so let's make it quick: Perl (and thus debmirror) is typically single-threaded, whereas goroutines scale nicely on multi-core systems. I don't think it makes a big difference on the machines I am running this on, as they have only two cores and are not maxed out on CPU usage at all. But maybe this will be more important for saturating 10 GBit/s? :-)
  5. (Mirror selection).
    It's not a feature of the code, but certainly relevant: of course I selected mirrors that are connected to the internet with at least a Gigabit link, serve files reasonably fast and have low latency to Rackspace's Chicago presence.
    Interestingly, while I used to think that universities have access to big uplinks and beefy machines, none of the mirrors that I use are located at a university. In fact, every mirror ending in .edu that I tried was really slow, most of them providing something like 2-4 MB/s.
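Here is the promised sketch of the pipelining idea. It is a stripped-down, hypothetical illustration, not the actual dcs-debmirror code: the mirror host, paths and error handling are placeholders, and the real tool additionally handles retries, multiple mirrors and keep-alive across batches:

package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"net"
	"net/http"
)

// fetchPipelined sends one batch of pipelined GET requests over a single
// TCP connection, then reads the responses back-to-back.
func fetchPipelined(host string, paths []string) error {
	conn, err := net.Dial("tcp", host+":80")
	if err != nil {
		return err
	}
	defer conn.Close()

	// Write all requests up front; only then start reading responses.
	bw := bufio.NewWriter(conn)
	for _, p := range paths {
		fmt.Fprintf(bw, "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n", p, host)
	}
	if err := bw.Flush(); err != nil {
		return err
	}

	br := bufio.NewReader(conn)
	for _, p := range paths {
		resp, err := http.ReadResponse(br, nil)
		if err != nil {
			return err
		}
		if resp.StatusCode != http.StatusOK {
			resp.Body.Close()
			// The real tool puts such files back into the queue and
			// reschedules them onto another mirror.
			return fmt.Errorf("%s: %s", p, resp.Status)
		}
		// The body must be consumed fully before the next response
		// can be parsed off the same connection.
		if _, err := io.Copy(io.Discard, resp.Body); err != nil {
			return err
		}
		resp.Body.Close()
	}
	return nil
}

func main() {
	// Placeholder mirror and paths; the real queue is sorted by size
	// first, so each connection starts with a big file.
	paths := []string{"/debian/README", "/debian/README.mirrors.txt"}
	if err := fetchPipelined("deb.debian.org", paths); err != nil {
		log.Fatal(err)
	}
}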
Results

With the latest iteration of the code, I can clone an entire Debian source mirror with 43 GB of data in about 11.5 minutes, which corresponds to a 63 MB/s download rate:
GOMAXPROCS=20 ./dcs-debmirror -num_workers=20
68,24s user 384,90s system 66% cpu 11:25,97 total
It should be noted that the download rate is very low in the first couple of seconds, since the sources.gz file is downloaded from a single mirror, then unpacked and analyzed. The peak download rate is about 115 MB/s (= 920 MBit/s), which is reasonably close to what you can achieve with a Gigabit link, I think. If the entire uplink were available to me at all times, the Rackspace hardware would be able to saturate it easily, both in terms of reading from the network and in terms of writing to block storage. I tested this on an SSD volume, but I see about 113 MB/s throughput with the internal hard disk, so I think that should work, too. There is another dstat screenshot of the final version (writing to disk). Perhaps even more interesting to some readers is the time for an incremental update:
GOMAXPROCS=20 ./dcs-debmirror -tcp_conns=20
2013/10/13 11:09:55 Downloading 307 files (447 MB total)
 
2013/10/13 11:10:04 All 307 files downloaded in 9.105605931s. Download rate is 49.186567 MB/s
4,91s user 3,99s system 67% cpu 13,205 total
The wall-clock time is higher than the time reported by the code because the code does not count the time for downloading and parsing the sources.gz file. The entire program is about 400 lines (not SLOC) of Go code. It's part of the Debian Code Search source. If you're interested, you can read dcs-debmirror.go in your browser.

Conclusions

The outcome of this experiment is that I now know (and have shown!) that there are significantly more efficient ways of cloning a Debian mirror than what debmirror does. Furthermore, I have a good grasp of what kind of performance the Rackspace cloud offers, and I am fairly happy with the numbers :-). My code is useful to me in the context of Debian Code Search, but unless you need a sid-only, source-only mirror, it will not be useful to you directly. Of course you can take the ideas that I implemented and implement them elsewhere; personally, I don't plan to do that. If you have hardware, bandwidth and a use-case for 10 GBit/s+ mirroring, I'd like to hear about it! :-)
